Detecting web attacks using random undersampling and ensemble learners

Abstract

Class imbalance is an important consideration for cybersecurity and machine learning. We explore classification performance in detecting web attacks in the recent CSE-CIC-IDS2018 dataset. This study considers a total of eight random undersampling (RUS) ratios: no sampling, 999:1, 99:1, 95:5, 9:1, 3:1, 65:35, and 1:1. Additionally, seven different classifiers are employed: Decision Tree (DT), Random Forest (RF), CatBoost (CB), LightGBM (LGB), XGBoost (XGB), Naive Bayes (NB), and Logistic Regression (LR). For classification performance metrics, Area Under the Receiver Operating Characteristic Curve (AUC) and Area Under the Precision-Recall Curve (AUPRC) are both utilized to answer the following three research questions. The first question asks: “Are various random undersampling ratios statistically different from each other in detecting web attacks?” The second question asks: “Are different classifiers statistically different from each other in detecting web attacks?” And our third question asks: “Is the interaction between different classifiers and random undersampling ratios significant for detecting web attacks?” Based on our experiments, the answer to all three research questions is “Yes”. To the best of our knowledge, we are the first to apply random undersampling techniques to web attacks from the CSE-CIC-IDS2018 dataset while exploring various sampling ratios.
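As a rough illustration of what a random undersampling ratio means in practice, the sketch below keeps every minority-class (attack) row and randomly discards majority-class rows until the requested majority:minority ratio is reached. It is not the authors' implementation: the function name `random_undersample`, the `RUS_RATIOS` table, the `minority_label` convention, and the fixed seed are all assumptions made for illustration. Only the eight ratio values come from the abstract.

```python
import random

# Majority:minority ratios listed in the abstract; None means "no sampling".
RUS_RATIOS = {
    "none": None,
    "999:1": 999 / 1,
    "99:1": 99 / 1,
    "95:5": 95 / 5,
    "9:1": 9 / 1,
    "3:1": 3 / 1,
    "65:35": 65 / 35,
    "1:1": 1 / 1,
}


def random_undersample(labels, ratio, minority_label=1, seed=0):
    """Return sorted row indices kept after randomly discarding majority rows.

    ratio is the desired majority:minority quotient (e.g. 9.0 for 9:1);
    None keeps every row. All minority rows are always kept.
    (Hypothetical helper, not taken from the paper.)
    """
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    if ratio is None:
        return sorted(minority + majority)
    # Never request more majority rows than actually exist.
    n_keep = min(len(majority), round(ratio * len(minority)))
    rng = random.Random(seed)
    kept_majority = rng.sample(majority, n_keep)
    return sorted(minority + kept_majority)
```

For example, with 1,000 normal rows and 10 attack rows, `random_undersample(labels, RUS_RATIOS["9:1"])` would keep the 10 attack rows plus 90 randomly chosen normal rows. A ratio that asks for more majority rows than exist (such as 999:1 here) leaves the data unchanged.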

Similar resources

Detecting Guessed and Random Learners' Answers through Their Brainwaves

This paper describes an experiment in which we tried to predict a learner’s answers from their brainwaves. We discuss how effectively the learner model can be enriched with electrical brain metrics to obtain important information about the learner during a test. We conducted an experiment to reach three objectives: the first one is to record the learner's brainwaves and answers to the test...

Data Dependent Learners Ensemble Pruning

Ensemble learning aims at combining several slightly different learners to construct a stronger learner. An ensemble of a well-selected subset of learners can outperform an ensemble of all of them. However, the well-studied accuracy/diversity ensemble pruning framework can overfit the training data, which results in a target learner of relatively low generalization ability. We propose to ensemb...

Learning from Imbalanced Data Using Ensemble Methods and Cluster-Based Undersampling

Imbalanced data, where the number of instances of one class is much higher than the others, are frequent in many domains such as fraud detection, telecommunications management, oil spill detection and text classification. Traditional classifiers do not perform well when considering data that are susceptible to both within-class and between-class imbalances. In this paper, we propose the ClustFi...

Detecting Web Attacks with End-to-End Deep Learning

Web applications are popular targets for cyber-attacks because they are network accessible and often contain vulnerabilities. An intrusion detection system monitors web applications and issues alerts when an attack attempt is detected. Existing implementations of intrusion detection systems usually extract features from network packets or string characteristics of input that are manually select...

A Lightweight Tool for Detecting Web Server Attacks

We present an intrusion-detection tool aimed at protecting web servers, and justify why such a tool is needed. We describe several interesting features, such as the ability to run in real time and to keep track of suspicious hosts. The design is flexible and the signatures used to detect malicious behavior are not limited to simple pattern matching of dangerous cgi scripts. The tool includes me...

Journal

Journal title: Journal of Big Data

Year: 2021

ISSN: 2196-1115

DOI: https://doi.org/10.1186/s40537-021-00460-8